This report is for ETC5521 Assignment 1 by Team Possum, comprising Brenwin Ang (31430759) and Joyce Lee (31114229).
Volcanic eruptions have been recorded for thousands of years; the first documented eruption occurred at Santorini in 1650 BC! Mysterious as they appear to us, volcanoes touch on a wide range of disciplines, including geochemistry, petrology, seismology and more. The topic can seem complex and distant to people like us who have not studied the area further. Yet we chanced upon a data set released by TidyTuesday that is probably one of the most comprehensive, if not the most comprehensive, for learning about volcanoes. This data exploration enables us to take a closer look at the more recently active volcanoes.
Although volcanoes are among nature’s most explosive landforms, and yet among its most beautiful, we as human beings cannot neglect their impact on our lives. Based on our understanding of volcanoes, we developed two themes that run through the entire analysis, “severity” and “safety”, to draw out more interesting facts!
We start with a brief introduction to the world of volcanoes, then examine the interaction between volcanoes and humans, and finally look into the characteristics of volcanoes.
The data originate from the Smithsonian Institution, which has been updating them continuously since 2013. A cleaned version, together with the cleaning script, is available for download from the TidyTuesday GitHub repository (https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-05-12/readme.md), so we import the 5 data sets directly for this analysis. The 5 data sets are volcano, eruptions, events, tree_rings and sulfur, which allows us to explore how they relate to one another.
The data comprise 5 linked data sets, each looking at a particular aspect of volcanoes:
volcano: Provides location, geological and population information on 958 volcanoes, with 26 variables.
eruptions: Details eruptions that have occurred from the Holocene period (11,345 years ago) to the present, with 15 variables.
events: Any event that occurred at the volcanoes, documented with 10 variables. It appears to be a condensed version of eruptions.
tree_rings: Tree rings are used as a climate proxy. In studying the effects of volcanoes on climate change, researchers matched the effects of eruptions to tree ring records. The measurements are taken yearly and no data are missing.
sulfur
Primary: Do volcanoes with different Volcanic Explosivity Index (VEI) values have specific characteristics?
Secondary: How worried should you be about a volcanic eruption? Which tectonic settings have higher or lower VEI, and how does this relate to the duration of eruptions? What is the ideal setting for a volcano to form?
There are 3 main types of active volcanoes (those that have erupted before): Stratovolcano, Caldera and Shield. We labelled all remaining types as Others. Stratovolcanoes are by far the most active type.
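The lumping of volcano types described above can be sketched as follows. This is a minimal Python illustration, not the R code used for the report; the function name `lump_volcano_type` and the raw label variants such as "Shield(s)" are assumptions made for the example.

```python
# Hypothetical helper: collapse raw type labels into the three main types
# named in the text, and label everything else "Others".
MAIN_TYPES = ("Stratovolcano", "Caldera", "Shield")


def lump_volcano_type(primary_type):
    """Map a raw type label to one of the three main types, or "Others"."""
    label = (primary_type or "").strip()
    for main in MAIN_TYPES:
        # Prefix match so assumed variants such as "Shield(s)" fall under "Shield".
        if label.startswith(main):
            return main
    return "Others"


# Illustrative labels only, not taken from the data set.
for raw in ["Stratovolcano(es)", "Shield(s)", "Maar(s)"]:
    print(raw, "->", lump_volcano_type(raw))
```

A prefix match is used so that plural or bracketed spellings of the same type end up in one group; an exact-match lookup table would be stricter if the raw labels are known in advance.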
We first plotted the world map of Active Volcanoes with tectonic plate boundaries overlaid.
We found that most volcanoes lie around the Pacific Ocean, tracing the boundaries of tectonic plates. We later learned that this region is dubbed the Ring of Fire (or Circum-Pacific Belt). The path is approximately 40,000 km long and is home to 75% of the world’s volcanoes and 90% of the active ones! The abundance of volcanoes is explained by the many tectonic plates meeting in the area, since volcanoes form at converging or diverging plate boundaries, which create cracks in the Earth’s crust.
To produce the map, we could not simply plot the tectonic boundaries, because some polygons stretch across the whole world map: their vertices appear at opposite ends of the map when they are in fact side by side, since the globe is round while a ggplot map is flat. So we split the tectonic plates at the prime meridian (the x-intercept where long = 0), with help from Z.Lin, a helpful member at StackOverflow. We also joined the data sets with the world map provided by ggplot to obtain continent and country coordinates, and used a data set from Kaggle to obtain the coordinates for the tectonic plate polygons.
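The idea of splitting a boundary where it crosses long = 0 can be sketched as below. This is a minimal Python illustration under stated assumptions, not the R solution the report relied on: the function name `split_at_longitude` is hypothetical, boundaries are treated as plain lists of (long, lat) pairs, and the crossing latitude is found by straight-line interpolation.

```python
def split_at_longitude(points, split_lon=0.0):
    """Split a polyline [(lon, lat), ...] into pieces that never cross split_lon.

    Assumption: consecutive points are joined by straight lines, so the
    latitude at the crossing is linearly interpolated.
    """
    pieces = []
    current = []
    for lon, lat in points:
        if current:
            prev_lon, prev_lat = current[-1]
            # The two points lie strictly on opposite sides of the split line.
            if (prev_lon - split_lon) * (lon - split_lon) < 0:
                t = (split_lon - prev_lon) / (lon - prev_lon)
                cross_lat = prev_lat + t * (lat - prev_lat)
                # Close the current piece at the crossing, start a new one there.
                current.append((split_lon, cross_lat))
                pieces.append(current)
                current = [(split_lon, cross_lat)]
        current.append((lon, lat))
    if current:
        pieces.append(current)
    return pieces


# Illustrative coordinates only.
boundary = [(-10, 0), (10, 20), (20, 20)]
for piece in split_at_longitude(boundary):
    print(piece)
```

Each returned piece can then be drawn as its own path, so no line segment is stretched across the seam of the map.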
This map zooms into the Ring of Fire, showing the volcanoes running along the plate boundaries. Relatively speaking, the volcanoes here have more eruptions.
In the above visualization, the height of each triangle (volcano) corresponds to the number of volcanoes in a population category, for populations within 5 km, 10 km, 30 km and 100 km of the volcano respectively.
For the majority of volcanoes, the population within 5 km was less than 100. At the 10 km range, there was a mix of population sizes. Populations of more than 100,000 were relatively scarce.
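Counting volcanoes per population category, as in the figure, could look roughly like the sketch below. It is a Python illustration rather than the report's R code, and the bin edges and labels are hypothetical: the text only mentions "less than 100" and "more than 100,000", so the intermediate cut-offs are assumptions.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical bin edges; only the outer two categories are named in the text.
BIN_EDGES = [100, 1_000, 10_000, 100_000]
BIN_LABELS = ["<100", "100-1k", "1k-10k", "10k-100k", ">=100k"]


def population_category(population):
    """Return the label of the bin that a population count falls into."""
    return BIN_LABELS[bisect_right(BIN_EDGES, population)]


def count_by_category(populations):
    """Count how many volcanoes fall into each population category."""
    counts = Counter(population_category(p) for p in populations)
    # Report every category, including empty ones, in a fixed order.
    return {label: counts.get(label, 0) for label in BIN_LABELS}


# Invented populations within some radius of four volcanoes.
print(count_by_category([0, 50, 100, 250_000]))
```

Running the same count once per radius (5 km, 10 km, 30 km, 100 km) would give the heights plotted for each distance.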
So how far must you be from an erupting volcano to be safe? There are no fixed safety zones (or distances), as volcanoes are unpredictable. Generally, a distance of around 1 km is relatively safe for small to medium eruptions (VEI = 1-2), as long as volcanic projectiles do not fall in that area (Volcanic Discovery). The above figure was inspired by Sil Aarts’ visualization; we use geom_polygon to create easy and intuitive visualizations.
The above plot fills each country according to its number of eruptions, and larger dots represent larger populations in the vicinity.
The prominent countries (shaded orange) with the most active volcanoes lie on the Ring of Fire.
VEI measures the explosiveness of volcanic eruptions. It is determined by the volume of material thrown out (e.g. pyroclastic material such as smoke and ash), the eruption cloud height and qualitative observations, and is described qualitatively using terms ranging from “gentle” to “mega-colossal” [Wikipedia]. For example, a VEI of 0 is given to non-explosive eruptions (\(<10{,}000\,m^3\) of tephra, the fragmental material produced by an eruption), while a VEI of 8 eruption can eject \(1\times10^{12}\,m^3\) of tephra.
In theory, VEI ranges from 0 to infinity. The scale is logarithmic, so each interval (an increase of 1 in VEI) indicates an eruption 10x more powerful. Over the last 132 million years, only 40 eruptions of VEI-8 magnitude have been documented, and there were only 10 eruptions of VEI-7 in the last 10,000 years.
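To make the log scale concrete, the sketch below compares two VEI values. It is a small Python illustration with a hypothetical function name, and it treats the ten-fold-per-step rule stated above as exact, which is a simplification.

```python
def relative_explosivity(vei_low, vei_high):
    """How many times more powerful vei_high is than vei_low.

    Assumption: every step of 1 in VEI is exactly a factor of ten,
    as described in the text.
    """
    return 10 ** (vei_high - vei_low)


# A VEI 5 eruption compared with a VEI 2 eruption: three steps apart.
print(relative_explosivity(2, 5))
```

Under this rule, a difference of three VEI levels corresponds to a factor of 10 x 10 x 10.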
We filtered the data set to eruptions from after 1812 to the present - 1812 being when the last observed VEI 7 was recorded, at Mount Tambora - and plotted a density plot of VEI against the number of eruptions.
The density plots above show that an eruption with VEI 4 or above is very unlikely (less than 1). In fact, almost 98% of all volcanoes have a VEI below 3.
In the lower VEI categories, VEI-2 eruptions occur rather frequently, with quite a number of volcanoes recording more than 50 of them over the time span.
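The filtering and counting behind these plots can be sketched as follows. This is a Python illustration rather than the report's R code; the field names `volcano_name`, `start_year` and `vei` are assumed column names, and the rows are invented purely to show the mechanics.

```python
from collections import Counter


def tally_eruptions(eruptions, after_year=1812):
    """Count eruptions per (volcano_name, vei) with start_year after the cut-off."""
    counts = Counter()
    for row in eruptions:
        # Skip records with a missing VEI or start year.
        if row["vei"] is None or row["start_year"] is None:
            continue
        if row["start_year"] > after_year:
            counts[(row["volcano_name"], row["vei"])] += 1
    return counts


def share_below(counts, vei_threshold):
    """Fraction of counted eruptions whose VEI is below the threshold."""
    total = sum(counts.values())
    below = sum(n for (_, vei), n in counts.items() if vei < vei_threshold)
    return below / total if total else 0.0


# Invented toy records, not real data.
toy_eruptions = [
    {"volcano_name": "A", "start_year": 1900, "vei": 2},
    {"volcano_name": "A", "start_year": 1950, "vei": 2},
    {"volcano_name": "B", "start_year": 1815, "vei": 7},
    {"volcano_name": "B", "start_year": 1700, "vei": 3},     # before the cut-off
    {"volcano_name": "C", "start_year": 2000, "vei": None},  # missing VEI
    {"volcano_name": "C", "start_year": 1990, "vei": 1},
]
toy_counts = tally_eruptions(toy_eruptions)
print(toy_counts)
print(share_below(toy_counts, 3))
```

The per-volcano counts are what a density plot of eruption frequency by VEI would be drawn from.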
Tiles in the above plot visualize frequency, with the most active volcano shown for each VEI category. Each tile represents an eruption.
Eruptions with VEI 2 occurred most frequently, followed by VEI-3 eruptions. At VEI-4 and above, volcanoes rarely erupted (or erupted prior to 1812) - Kelul-4 was an exceptional case, as the other VEI-4 volcanoes erupted very seldom.
Gentler eruptions occur more frequently, while more explosive eruptions occur far less often.
The figure first orders the long duration (>365 days) volcano eruptions by tectonic setting. Subduction zone / Continental crust (>25 km) dominates all tectonic settings, both in absolute count and proportionally, with 90 long duration eruptions accounting for 58.824% of the total. It is followed by Subduction zone / Oceanic crust (<15 km), although the gap between them is large: 14 long duration eruptions occurred there, or 9.15%. The remaining tectonic settings have very similar counts and proportions, with Intraplate / Intermediate crust (15-25 km) having the fewest long duration eruptions.
In the same way, the number of short duration (<30 days) eruptions for each tectonic setting is exhibited below. As in the plot for long duration eruptions, Subduction zone / Continental crust (>25 km) and Intraplate / Intermediate crust (15-25 km) remain in first and last position respectively. Some variation emerges in the middle of the order. For instance, there are only 7 long duration eruptions (4.575%) on Rift zone / Oceanic crust (<15 km), but the number of short duration eruptions rises to 30, accounting for 10.067% of all short duration eruptions. Moreover, comparing the lengths of the bars, every tectonic setting has more short duration than long duration eruptions. Hence, short duration eruptions occur more frequently than long ones.
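The duration classes and percentages above can be reproduced with a sketch like the one below. It is a Python illustration with hypothetical function names, not the report's R code. The "medium" label for eruptions between the two thresholds is an assumption, and the total of 153 long duration eruptions is back-calculated from the reported 58.824% rather than stated in the report.

```python
from datetime import date


def classify_duration(start, end, short_below=30, long_above=365):
    """Label an eruption "short" (<30 days), "long" (>365 days) or "medium"."""
    days = (end - start).days
    if days > long_above:
        return "long"
    if days < short_below:
        return "short"
    return "medium"


def percentage(part, total, digits=3):
    """Share of part in total, as a rounded percentage."""
    return round(100 * part / total, digits)


# Invented dates, used only to show the classification.
print(classify_duration(date(2000, 1, 1), date(2001, 6, 1)))
print(classify_duration(date(2000, 1, 1), date(2000, 1, 10)))

# 90 long duration eruptions out of an inferred total of 153.
print(percentage(90, 153))
```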
The map plots the locations of the volcanoes, categorised by the duration of their eruptions. Volcanoes with short duration eruptions are visibly more concentrated and tend to form “line” shapes, whereas volcanoes with long duration eruptions seem more dispersed. Taken as a whole, volcanoes are mainly located in coastal areas.
This map captures the volcanoes that have erupted from year to year according to their respective VEI level.
The figure depicts the tectonic settings with a record of at least two volcano eruptions within one year. Of the 10 tectonic settings recorded (NAs excluded), only 6 meet this condition. The settings listed here all have both long and short duration eruptions as well. Subduction zone / Continental crust (>25 km) again ranks first, with 21 high frequency eruptions, but this time it is followed by Intraplate / Oceanic crust (<15 km), where only 7 long and 15 short duration eruptions are recorded. Rift zone / Oceanic crust (<15 km), which has the fewest high frequency eruptions, actually has 30 short duration eruptions, which is quite high.
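One way to flag "high frequency" cases is sketched below. It is a Python illustration, not the report's R code, and it assumes that "within one year" means the same volcano erupting at least twice in the same calendar year; the field names and rows are hypothetical.

```python
from collections import Counter


def high_frequency_years(eruptions, min_eruptions=2):
    """Return {(volcano_name, start_year): count} for years with enough eruptions."""
    per_year = Counter((row["volcano_name"], row["start_year"]) for row in eruptions)
    return {key: n for key, n in per_year.items() if n >= min_eruptions}


# Invented toy records, not real data.
toy_records = [
    {"volcano_name": "A", "start_year": 1990},
    {"volcano_name": "A", "start_year": 1990},
    {"volcano_name": "A", "start_year": 1991},
    {"volcano_name": "B", "start_year": 1990},
]
print(high_frequency_years(toy_records))
```

Joining the flagged volcanoes back to their tectonic setting would then give the counts per setting shown in the figure.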
Grouped by the 10 tectonic settings (NAs excluded), volcano eruptions on Rift zone / Intermediate crust (15-25 km) have the highest average VEI, 2.667, whereas those on Intraplate / Oceanic crust (<15 km) have the lowest, 0.923. Recall that Subduction zone / Continental crust (>25 km) is the setting with the most long duration (>365 days), short duration (<30 days) and high frequency (≥ 2 eruptions within 1 year) eruptions. It nevertheless does not have a high average VEI: it sits in the middle of the 10 tectonic settings, with an average VEI of 1.982.
Overall, the setting with the highest average VEI (2.667), Rift zone / Intermediate crust (15-25 km), has neither long duration (>365 days) nor high frequency (≥ 2 eruptions in 1 year) eruptions, and only one short duration (<30 days) eruption. Conversely, volcanoes on Intraplate / Oceanic crust (<15 km), which has the lowest average VEI (0.923), do not tend to erupt for long: its long duration eruptions account for only 4.575% of the total across tectonic settings. It does, however, hold greater shares of both short duration (5.033%) and high frequency (9.677%) eruptions.
| Tectonic setting | Average VEI |
|---|---|
| Rift zone / Intermediate crust (15-25 km) | 2.667 |
| Intraplate / Continental crust (>25 km) | 2.250 |
| Subduction zone / Intermediate crust (15-25 km) | 2.195 |
| Rift zone / Oceanic crust (< 15 km) | 2.132 |
| Intraplate / Intermediate crust (15-25 km) | 2.091 |
| Subduction zone / Continental crust (>25 km) | 1.982 |
| Subduction zone / Oceanic crust (< 15 km) | 1.796 |
| Rift zone / Continental crust (>25 km) | 1.714 |
| Subduction zone / Crustal thickness unknown | 1.623 |
| Intraplate / Oceanic crust (< 15 km) | 0.923 |
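A grouped average like the one in this table can be computed along the lines of the sketch below. It is a Python illustration with a hypothetical function name and invented rows, not the report's R code; missing settings or VEI values are skipped, mirroring the "NAs excluded" note.

```python
from collections import defaultdict


def average_vei_by_setting(rows, digits=3):
    """Return [(setting, mean VEI)] sorted from highest to lowest mean."""
    groups = defaultdict(list)
    for setting, vei in rows:
        # Exclude records where either value is missing.
        if setting is None or vei is None:
            continue
        groups[setting].append(vei)
    averages = {s: round(sum(v) / len(v), digits) for s, v in groups.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Invented toy rows: (tectonic setting, VEI).
toy_rows = [("X", 2), ("X", 3), ("X", 3), ("Y", 1), ("Y", 0), (None, 5), ("Y", None)]
for setting, mean_vei in average_vei_by_setting(toy_rows):
    print(setting, mean_vei)
```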
Tree rings are an important indicator of climate and temperature change.
Tree ring z-score and temperature index
| Model | R-squared |
|---|---|
| lm1 (no eruption) | 0.6288653 |
| lm2 (eruption) | 0.6289628 |
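The two R-squared values in the table differ only from the fourth decimal place onwards. As a reminder of what the statistic measures, the sketch below computes it from observed and fitted values. It is a generic Python illustration with a hypothetical function name and invented numbers, not the models fitted in the report.

```python
def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1 - ss_res / ss_tot


# Invented values: a perfect fit, then a fit that only predicts the mean.
print(r_squared([1, 2, 3], [1, 2, 3]))
print(r_squared([1, 2, 3], [2, 2, 2]))
```

A model whose extra term adds almost nothing to R-squared explains almost no additional variation in the response.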
These packages are used to produce this report:
tidyverse (Wickham et al. 2019), lubridate (Grolemund and Wickham 2011), broom (Robinson, Hayes, and Couch 2020), leaflet (Cheng, Karambelkar, and Xie 2019), ggmap (Kahle and Wickham 2013), mapview (Appelhans et al. 2020), viridis (Garnier 2018), rgdal (Bivand, Keitt, and Rowlingson 2020), kableExtra (Zhu 2020), gridExtra (Auguie 2017), readr (Wickham, Hester, and Francois 2018), knitr (Xie 2014), sf (Pebesma 2018), data.table (Dowle and Srinivasan 2020), ggthemes (Arnold 2019), maps (Becker et al. 2018), ggridges (Wilke 2020), rvest (Wickham 2020), plotly (Sievert 2020).
Appelhans, Tim, Florian Detsch, Christoph Reudenbach, and Stefan Woellauer. 2020. Mapview: Interactive Viewing of Spatial Data in R. https://CRAN.R-project.org/package=mapview.
Arnold, Jeffrey B. 2019. Ggthemes: Extra Themes, Scales and Geoms for ’Ggplot2’. https://CRAN.R-project.org/package=ggthemes.
Auguie, Baptiste. 2017. GridExtra: Miscellaneous Functions for "Grid" Graphics. https://CRAN.R-project.org/package=gridExtra.
Bivand, Roger, Tim Keitt, and Barry Rowlingson. 2020. Rgdal: Bindings for the ’Geospatial’ Data Abstraction Library. https://CRAN.R-project.org/package=rgdal.
Cheng, Joe, Bhaskar Karambelkar, and Yihui Xie. 2019. Leaflet: Create Interactive Web Maps with the Javascript ’Leaflet’ Library. https://CRAN.R-project.org/package=leaflet.
Dowle, Matt, and Arun Srinivasan. 2020. Data.table: Extension of ’Data.frame’. https://CRAN.R-project.org/package=data.table.
Garnier, Simon. 2018. Viridis: Default Color Maps from ’Matplotlib’. https://CRAN.R-project.org/package=viridis.
Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times Made Easy with lubridate.” Journal of Statistical Software 40 (3): 1–25. http://www.jstatsoft.org/v40/i03/.
Kahle, David, and Hadley Wickham. 2013. “Ggmap: Spatial Visualization with Ggplot2.” The R Journal 5 (1): 144–61. https://journal.r-project.org/archive/2013-1/kahle-wickham.pdf.
Pebesma, Edzer. 2018. “Simple Features for R: Standardized Support for Spatial Vector Data.” The R Journal 10 (1): 439–46. https://doi.org/10.32614/RJ-2018-009.
Becker, Richard A., Allan R. Wilks, Ray Brownrigg, Thomas P. Minka, and Alex Deckmyn. 2018. Maps: Draw Geographical Maps. https://CRAN.R-project.org/package=maps.
Robinson, David, Alex Hayes, and Simon Couch. 2020. Broom: Convert Statistical Objects into Tidy Tibbles. https://CRAN.R-project.org/package=broom.
Sievert, Carson. 2020. Interactive Web-Based Data Visualization with R, Plotly, and Shiny. Chapman; Hall/CRC. https://plotly-r.com.
Wickham, Hadley. 2020. Rvest: Easily Harvest (Scrape) Web Pages. https://CRAN.R-project.org/package=rvest.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
Wickham, Hadley, Jim Hester, and Romain Francois. 2018. Readr: Read Rectangular Text Data. https://CRAN.R-project.org/package=readr.
Wilke, Claus O. 2020. Ggridges: Ridgeline Plots in ’Ggplot2’. https://CRAN.R-project.org/package=ggridges.
Xie, Yihui. 2014. “Knitr: A Comprehensive Tool for Reproducible Research in R.” In Implementing Reproducible Computational Research, edited by Victoria Stodden, Friedrich Leisch, and Roger D. Peng. Chapman; Hall/CRC. http://www.crcpress.com/product/isbn/9781466561595.
Zhu, Hao. 2020. KableExtra: Construct Complex Table with ’Kable’ and Pipe Syntax. https://CRAN.R-project.org/package=kableExtra.